Overview

Dataset statistics

Number of variables: 30
Number of observations: 233417
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 53.6 MiB
Average record size in memory: 241.0 B
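The average record size can be cross-checked from the total size and the number of observations. The following is a minimal sketch, assuming that MiB means 2**20 bytes and that the small gap to the reported 241.0 B comes from the total being rounded to one decimal place; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
total_size_mib = 53.6
n_observations = 233417

BYTES_PER_MIB = 1024 ** 2  # assumption: 1 MiB = 2**20 bytes

# Average bytes per row = total bytes / number of rows.
avg_record_bytes = total_size_mib * BYTES_PER_MIB / n_observations
print(f"{avg_record_bytes:.1f} B per record")
```

The result lands within a fraction of a byte of the reported value, which is consistent with rounding of the 53.6 MiB total.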

Variable types

Numeric: 9
Categorical: 21

Alerts

model_name has a high cardinality: 1143 distinct values [High cardinality]
df_index is highly correlated with price and 1 other field [High correlation]
engine_displacement is highly correlated with engine_power [High correlation]
mileage is highly correlated with model_date and 7 other fields [High correlation]
model_date is highly correlated with mileage and 6 other fields [High correlation]
year is highly correlated with mileage and 6 other fields [High correlation]
owners is highly correlated with mileage and 6 other fields [High correlation]
price is highly correlated with df_index and 8 other fields [High correlation]
data_type is highly correlated with df_index and 1 other field [High correlation]
state_new is highly correlated with mileage and 6 other fields [High correlation]
engine_power is highly correlated with engine_displacement and 1 other field [High correlation]
warranty is highly correlated with mileage and 7 other fields [High correlation]
new_car is highly correlated with mileage and 6 other fields [High correlation]
mil_per_year is highly correlated with mileage and 2 other fields [High correlation]
df_index is highly correlated with data_type [High correlation]
engine_displacement is highly correlated with engine_power [High correlation]
mileage is highly correlated with model_date and 6 other fields [High correlation]
model_date is highly correlated with mileage and 5 other fields [High correlation]
year is highly correlated with mileage and 6 other fields [High correlation]
owners is highly correlated with mileage and 6 other fields [High correlation]
price is highly correlated with year and 4 other fields [High correlation]
data_type is highly correlated with df_index [High correlation]
state_new is highly correlated with mileage and 6 other fields [High correlation]
engine_power is highly correlated with engine_displacement and 1 other field [High correlation]
warranty is highly correlated with mileage and 8 other fields [High correlation]
old_car is highly correlated with model_date and 1 other field [High correlation]
new_car is highly correlated with mileage and 6 other fields [High correlation]
description_count is highly correlated with owners and 2 other fields [High correlation]
mil_per_year is highly correlated with mileage and 2 other fields [High correlation]
df_index is highly correlated with data_type [High correlation]
engine_displacement is highly correlated with engine_power [High correlation]
mileage is highly correlated with model_date and 6 other fields [High correlation]
model_date is highly correlated with mileage and 5 other fields [High correlation]
year is highly correlated with mileage and 6 other fields [High correlation]
owners is highly correlated with mileage and 5 other fields [High correlation]
price is highly correlated with model_date and 2 other fields [High correlation]
data_type is highly correlated with df_index and 1 other field [High correlation]
state_new is highly correlated with mileage and 5 other fields [High correlation]
engine_power is highly correlated with engine_displacement [High correlation]
warranty is highly correlated with mileage and 5 other fields [High correlation]
new_car is highly correlated with mileage and 5 other fields [High correlation]
mil_per_year is highly correlated with mileage and 1 other field [High correlation]
drivetrain is highly correlated with body_type [High correlation]
popular_body is highly correlated with body_type [High correlation]
transmission is highly correlated with brand [High correlation]
brand is highly correlated with transmission and 1 other field [High correlation]
condition_good is highly correlated with brand and 1 other field [High correlation]
body_type is highly correlated with drivetrain and 2 other fields [High correlation]
new_car is highly correlated with state_new and 2 other fields [High correlation]
color_top5 is highly correlated with color [High correlation]
state_new is highly correlated with new_car and 2 other fields [High correlation]
owners is highly correlated with new_car and 2 other fields [High correlation]
color is highly correlated with color_top5 and 1 other field [High correlation]
color_rare is highly correlated with color [High correlation]
number_of_doors is highly correlated with body_type [High correlation]
warranty is highly correlated with new_car and 2 other fields [High correlation]
data_type is highly correlated with condition_good [High correlation]
df_index is highly correlated with brand and 2 other fields [High correlation]
body_type is highly correlated with brand and 3 other fields [High correlation]
brand is highly correlated with df_index and 7 other fields [High correlation]
color is highly correlated with color_top5 and 1 other field [High correlation]
mileage is highly correlated with model_date and 5 other fields [High correlation]
model_date is highly correlated with mileage and 6 other fields [High correlation]
number_of_doors is highly correlated with body_type and 1 other field [High correlation]
year is highly correlated with model_date and 2 other fields [High correlation]
transmission is highly correlated with brand and 1 other field [High correlation]
owners is highly correlated with mileage and 6 other fields [High correlation]
vehicle_licence_original is highly correlated with owners [High correlation]
drivetrain is highly correlated with body_type and 2 other fields [High correlation]
condition_good is highly correlated with df_index and 2 other fields [High correlation]
data_type is highly correlated with df_index and 2 other fields [High correlation]
state_new is highly correlated with mileage and 4 other fields [High correlation]
engine_power is highly correlated with brand and 2 other fields [High correlation]
warranty is highly correlated with mileage and 5 other fields [High correlation]
popular_body is highly correlated with body_type and 2 other fields [High correlation]
rarity_car is highly correlated with model_date and 1 other field [High correlation]
old_car is highly correlated with mileage and 2 other fields [High correlation]
new_car is highly correlated with mileage and 4 other fields [High correlation]
color_top5 is highly correlated with color [High correlation]
color_rare is highly correlated with color [High correlation]
description_count is highly correlated with owners and 2 other fields [High correlation]
engine_displacement is highly skewed (γ1 = 109.9737064) [Skewed]
df_index has unique values [Unique]
mileage has 34721 (14.9%) zeros [Zeros]
price has 34686 (14.9%) zeros [Zeros]
description_count has 3307 (1.4%) zeros [Zeros]
mil_per_year has 34743 (14.9%) zeros [Zeros]
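The percentages in the zero-value alerts are each zero count divided by the number of observations. The following is a minimal sketch that recomputes them from the counts quoted in the alerts; the dictionary and variable names are illustrative and do not come from the report.

```python
# Number of observations and zero counts, copied from the report.
n_observations = 233417
zero_counts = {
    "mileage": 34721,
    "price": 34686,
    "description_count": 3307,
    "mil_per_year": 34743,
}

# Share of zeros per variable, in percent, rounded to one decimal place
# as in the alert text.
zero_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}

for name, pct in zero_pct.items():
    print(f"{name}: {pct}% zeros")
```

Three of the four variables round to the same 14.9% although their raw counts differ slightly, so compare the counts rather than the rounded percentages when checking whether the zeros fall on the same rows.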

Reproduction

Analysis started: 2022-01-30 11:33:08.926434
Analysis finished: 2022-01-30 11:34:19.083914
Duration: 1 minute and 10.16 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
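The reported duration follows from the start and finish timestamps. The following is a minimal sketch using only the Python standard library; the format string is an assumption based on how the timestamps are printed here.

```python
from datetime import datetime

# Assumed layout of the timestamps as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-01-30 11:33:08.926434", FMT)
finished = datetime.strptime("2022-01-30 11:34:19.083914", FMT)

# Elapsed wall-clock time in seconds, split into minutes and seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```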

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 233417
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 120264.5439
Minimum: 0
Maximum: 254262
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 0
5th percentile: 11670.8
Q1: 58455
Median: 117096
Q3: 180518
95th percentile: 238989.2
Maximum: 254262
Range: 254262
Interquartile range (IQR): 122063
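The range and the interquartile range are derived from the other quantiles: range is maximum minus minimum, and IQR is Q3 minus Q1. The following is a minimal sketch using the df_index quantiles reported above; the dictionary keys are illustrative names.

```python
# Quantiles of df_index, copied from the report.
quantiles = {
    "min": 0,
    "q1": 58455,
    "median": 117096,
    "q3": 180518,
    "max": 254262,
}

# Range spans the full data; IQR spans the middle 50% of observations.
value_range = quantiles["max"] - quantiles["min"]
iqr = quantiles["q3"] - quantiles["q1"]

print("Range:", value_range)
print("IQR:", iqr)
```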

Descriptive statistics

Standard deviation: 71961.12092
Coefficient of variation (CV): 0.5983569104
Kurtosis: -1.132639057
Mean: 120264.5439
Median Absolute Deviation (MAD): 60767
Skewness: 0.1185311248
Sum: 2.807178904 × 10^10
Variance: 5178402923
Monotonicity: Strictly increasing
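Several of these descriptive statistics are linked by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The following is a minimal sketch that checks the df_index figures against each other; because the inputs are rounded as printed, the results agree only to about the printed precision.

```python
import math

# Figures for df_index, copied from the report (rounded as printed).
n_observations = 233417
mean = 120264.5439
std = 71961.12092

cv = std / mean               # coefficient of variation
variance = std ** 2           # variance is the squared standard deviation
total = mean * n_observations # sum of all values

print(f"CV       = {cv:.10f}")
print(f"Variance = {variance:.0f}")
print(f"Sum      = {total:.9e}")

# Compare with the reported values, allowing for rounding of the inputs.
assert math.isclose(cv, 0.5983569104, rel_tol=1e-7)
assert math.isclose(variance, 5178402923, rel_tol=1e-7)
assert math.isclose(total, 2.807178904e10, rel_tol=1e-7)
```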
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 1 | < 0.1%
128404 | 1 | < 0.1%
165114 | 1 | < 0.1%
167163 | 1 | < 0.1%
177404 | 1 | < 0.1%
179453 | 1 | < 0.1%
173310 | 1 | < 0.1%
175359 | 1 | < 0.1%
87424 | 1 | < 0.1%
89473 | 1 | < 0.1%
Other values (233407) | 233407 | > 99.9%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
254262 | 1 | < 0.1%
254261 | 1 | < 0.1%
254260 | 1 | < 0.1%
254259 | 1 | < 0.1%
254258 | 1 | < 0.1%
254257 | 1 | < 0.1%
254256 | 1 | < 0.1%
254255 | 1 | < 0.1%
254254 | 1 | < 0.1%
254253 | 1 | < 0.1%

body_type
Categorical

HIGH CORRELATION

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
внедорожник (SUV) | 105677
седан (sedan) | 72691
хэтчбек (hatchback) | 17259
лифтбек (liftback) | 11836
минивэн (minivan) | 6993
Other values (11) | 18961

Length

Max length: 11
Median length: 7
Mean length: 8.20696436
Min length: 4

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: лифтбек (liftback)
2nd row: лифтбек (liftback)
3rd row: лифтбек (liftback)
4th row: лифтбек (liftback)
5th row: лифтбек (liftback)

Common Values

Value | Count | Frequency (%)
внедорожник (SUV) | 105677 | 45.3%
седан (sedan) | 72691 | 31.1%
хэтчбек (hatchback) | 17259 | 7.4%
лифтбек (liftback) | 11836 | 5.1%
минивэн (minivan) | 6993 | 3.0%
универсал (station wagon) | 6261 | 2.7%
купе (coupe) | 5196 | 2.2%
компактвэн (compact MPV) | 3922 | 1.7%
пикап (pickup) | 2164 | 0.9%
фургон (van) | 691 | 0.3%
Other values (6) | 727 | 0.3%

Length

Histogram of lengths of the category

brand
Categorical

HIGH CORRELATION

Distinct: 36
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
mercedes | 26750
bmw | 25581
toyota | 24832
volkswagen | 24379
nissan | 23606
Other values (31) | 108269

Length

Max length: 10
Median length: 6
Mean length: 6.304318023
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: skoda
2nd row: skoda
3rd row: skoda
4th row: skoda
5th row: skoda

Common Values

Value | Count | Frequency (%)
mercedes | 26750 | 11.5%
bmw | 25581 | 11.0%
toyota | 24832 | 10.6%
volkswagen | 24379 | 10.4%
nissan | 23606 | 10.1%
audi | 17474 | 7.5%
mitsubishi | 15042 | 6.4%
skoda | 13777 | 5.9%
volvo | 6925 | 3.0%
honda | 6652 | 2.8%
Other values (26) | 48399 | 20.7%

Length

Histogram of lengths of the category

color
Categorical

HIGH CORRELATION

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
чёрный (black) | 70704
белый (white) | 49106
серый (grey) | 29413
серебристый (silver) | 23361
синий (blue) | 21831
Other values (11) | 39002

Length

Max length: 11
Median length: 6
Mean length: 6.396260769
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: синий (blue)
2nd row: чёрный (black)
3rd row: серый (grey)
4th row: коричневый (brown)
5th row: белый (white)

Common Values

Value | Count | Frequency (%)
чёрный (black) | 70704 | 30.3%
белый (white) | 49106 | 21.0%
серый (grey) | 29413 | 12.6%
серебристый (silver) | 23361 | 10.0%
синий (blue) | 21831 | 9.4%
красный (red) | 10806 | 4.6%
коричневый (brown) | 8864 | 3.8%
зелёный (green) | 5570 | 2.4%
бежевый (beige) | 4923 | 2.1%
голубой (light blue) | 3090 | 1.3%
Other values (6) | 5749 | 2.5%

Length

Histogram of lengths of the category

engine_displacement
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 70
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.357599918
Minimum: 0
Maximum: 300
Zeros: 261
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 0
5th percentile: 1.4
Q1: 1.6
Median: 2
Q3: 2.9
95th percentile: 4.4
Maximum: 300
Range: 300
Interquartile range (IQR): 1.3
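The maximum of 300 is far outside the bulk of the distribution, which is what drives the skewness alert for engine_displacement. One common way to flag such values is Tukey's 1.5 × IQR rule. This rule is not something the report itself applies; it is sketched here as an illustration, using the quartiles above and the ten largest values the report lists for this variable.

```python
# Quartiles of engine_displacement, copied from the report.
q1, q3 = 1.6, 2.9
iqr = q3 - q1

# Tukey's fences (an assumed, illustrative outlier rule).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The ten largest values listed in the report for this variable.
largest_reported = [300, 16, 8.4, 8.2, 8.1, 7.5, 7.4, 7.3, 7, 6.8]
flagged = [v for v in largest_reported if v > upper_fence]

print(f"Upper fence: {upper_fence:.2f}")
print("Flagged values:", flagged)
```

The fence sits only slightly above the 95th percentile (4.4), so it also flags large but plausible engines. Treat the flagged list as candidates for review rather than confirmed errors; 300 and 16 are the values that stand apart most clearly.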

Descriptive statistics

Standard deviation: 1.27350499
Coefficient of variation (CV): 0.5401701028
Kurtosis: 25566.05973
Mean: 2.357599918
Median Absolute Deviation (MAD): 0.4
Skewness: 109.9737064
Sum: 550303.9
Variance: 1.62181496
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
2 | 60172 | 25.8%
1.6 | 34804 | 14.9%
3 | 24842 | 10.6%
2.5 | 17544 | 7.5%
1.8 | 13099 | 5.6%
1.4 | 12963 | 5.6%
2.4 | 9917 | 4.2%
3.5 | 8477 | 3.6%
1.5 | 5209 | 2.2%
2.9 | 3314 | 1.4%
Other values (60) | 43076 | 18.5%

Minimum 10 values
Value | Count | Frequency (%)
0 | 261 | 0.1%
0.7 | 573 | 0.2%
0.8 | 364 | 0.2%
1 | 756 | 0.3%
1.1 | 77 | < 0.1%
1.2 | 2443 | 1.0%
1.3 | 2311 | 1.0%
1.4 | 12963 | 5.6%
1.5 | 5209 | 2.2%
1.6 | 34804 | 14.9%

Maximum 10 values
Value | Count | Frequency (%)
300 | 2 | < 0.1%
16 | 2 | < 0.1%
8.4 | 1 | < 0.1%
8.2 | 1 | < 0.1%
8.1 | 1 | < 0.1%
7.5 | 1 | < 0.1%
7.4 | 1 | < 0.1%
7.3 | 1 | < 0.1%
7 | 3 | < 0.1%
6.8 | 6 | < 0.1%

engine_type
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
gasoline | 187242
diesel | 44389
hybrid | 1502
electro | 265
lpg | 19

Length

Max length: 8
Median length: 8
Mean length: 7.605247261
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: gasoline
2nd row: gasoline
3rd row: gasoline
4th row: gasoline
5th row: gasoline

Common Values

Value | Count | Frequency (%)
gasoline | 187242 | 80.2%
diesel | 44389 | 19.0%
hybrid | 1502 | 0.6%
electro | 265 | 0.1%
lpg | 19 | < 0.1%

Length

Histogram of lengths of the category

Pie chart

mileage
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 35352
Distinct (%): 15.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 120059.4554
Minimum: 0
Maximum: 1000000
Zeros: 34721
Zeros (%): 14.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 38000
Median: 112000
Q3: 177799
95th percentile: 300000
Maximum: 1000000
Range: 1000000
Interquartile range (IQR): 139799

Descriptive statistics

Standard deviation: 99448.91144
Coefficient of variation (CV): 0.8283305228
Kurtosis: 2.962147227
Mean: 120059.4554
Median Absolute Deviation (MAD): 68390
Skewness: 1.100667999
Sum: 2.802391789 × 10^10
Variance: 9890085986
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 34721 | 14.9%
200000 | 2032 | 0.9%
180000 | 1959 | 0.8%
250000 | 1769 | 0.8%
150000 | 1764 | 0.8%
160000 | 1557 | 0.7%
300000 | 1346 | 0.6%
170000 | 1324 | 0.6%
100000 | 1279 | 0.5%
120000 | 1184 | 0.5%
Other values (35342) | 184482 | 79.0%

Minimum 10 values
Value | Count | Frequency (%)
0 | 34721 | 14.9%
1 | 68 | < 0.1%
2 | 4 | < 0.1%
3 | 3 | < 0.1%
4 | 1 | < 0.1%
5 | 24 | < 0.1%
6 | 13 | < 0.1%
7 | 20 | < 0.1%
8 | 5 | < 0.1%
9 | 14 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1000000 | 26 | < 0.1%
999999 | 31 | < 0.1%
995200 | 1 | < 0.1%
993700 | 1 | < 0.1%
990000 | 2 | < 0.1%
989000 | 1 | < 0.1%
985000 | 1 | < 0.1%
977097 | 2 | < 0.1%
955000 | 1 | < 0.1%
949880 | 1 | < 0.1%

model_date
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 80
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.310059
Minimum: 1904
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 1904
5th percentile: 1997
Q1: 2006
Median: 2011
Q3: 2016
95th percentile: 2019
Maximum: 2021
Range: 117
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.210750565
Coefficient of variation (CV): 0.003586884786
Kurtosis: 4.853439223
Mean: 2010.310059
Median Absolute Deviation (MAD): 5
Skewness: -1.399192111
Sum: 469240543
Variance: 51.99492371
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
2017 | 20680 | 8.9%
2014 | 16447 | 7.0%
2010 | 16301 | 7.0%
2018 | 15530 | 6.7%
2015 | 15055 | 6.4%
2011 | 12535 | 5.4%
2009 | 12420 | 5.3%
2013 | 11226 | 4.8%
2012 | 10862 | 4.7%
2008 | 10412 | 4.5%
Other values (70) | 91949 | 39.4%

Minimum 10 values
Value | Count | Frequency (%)
1904 | 2 | < 0.1%
1908 | 2 | < 0.1%
1927 | 3 | < 0.1%
1932 | 2 | < 0.1%
1934 | 2 | < 0.1%
1935 | 1 | < 0.1%
1936 | 8 | < 0.1%
1937 | 8 | < 0.1%
1938 | 17 | < 0.1%
1948 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
2021 | 627 | 0.3%
2020 | 8352 | 3.6%
2019 | 7428 | 3.2%
2018 | 15530 | 6.7%
2017 | 20680 | 8.9%
2016 | 8451 | 3.6%
2015 | 15055 | 6.4%
2014 | 16447 | 7.0%
2013 | 11226 | 4.8%
2012 | 10862 | 4.7%

model_name
Categorical

HIGH CARDINALITY

Distinct: 1143
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
x_trail | 7783
tiguan | 6234
a6 | 5326
octavia | 5200
outlander | 5128
Other values (1138) | 203746

Length

Max length: 20
Median length: 5
Mean length: 4.96867409
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 130
Unique (%): 0.1%

Sample

1st row: octavia
2nd row: octavia
3rd row: superb
4th row: octavia
5th row: octavia

Common Values

Value | Count | Frequency (%)
x_trail | 7783 | 3.3%
tiguan | 6234 | 2.7%
a6 | 5326 | 2.3%
octavia | 5200 | 2.2%
outlander | 5128 | 2.2%
camry | 4946 | 2.1%
5 | 4742 | 2.0%
3 | 4397 | 1.9%
polo | 4365 | 1.9%
x5 | 4314 | 1.8%
Other values (1133) | 180982 | 77.5%

Length

Histogram of lengths of the category

number_of_doors
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
5.0 | 145244
4.0 | 78244
2.0 | 5962
3.0 | 3967

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5.0
2nd row: 5.0
3rd row: 5.0
4th row: 5.0
5th row: 5.0

Common Values

Value | Count | Frequency (%)
5.0 | 145244 | 62.2%
4.0 | 78244 | 33.5%
2.0 | 5962 | 2.6%
3.0 | 3967 | 1.7%

Length

Histogram of lengths of the category

Pie chart

year
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 81
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2012.531885
Minimum: 1904
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 1904
5th percentile: 1999
Q1: 2008
Median: 2013
Q3: 2018
95th percentile: 2021
Maximum: 2021
Range: 117
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.960645197
Coefficient of variation (CV): 0.003458650891
Kurtosis: 4.225717145
Mean: 2012.531885
Median Absolute Deviation (MAD): 5
Skewness: -1.286804337
Sum: 469759155
Variance: 48.45058156
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
2021 | 23766 | 10.2%
2020 | 17861 | 7.7%
2018 | 15860 | 6.8%
2013 | 14772 | 6.3%
2012 | 14595 | 6.3%
2011 | 14509 | 6.2%
2014 | 13619 | 5.8%
2008 | 12394 | 5.3%
2016 | 11943 | 5.1%
2017 | 11771 | 5.0%
Other values (71) | 82327 | 35.3%

Minimum 10 values
Value | Count | Frequency (%)
1904 | 2 | < 0.1%
1923 | 1 | < 0.1%
1924 | 1 | < 0.1%
1927 | 2 | < 0.1%
1931 | 1 | < 0.1%
1932 | 2 | < 0.1%
1936 | 2 | < 0.1%
1937 | 6 | < 0.1%
1938 | 6 | < 0.1%
1939 | 4 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
2021 | 23766 | 10.2%
2020 | 17861 | 7.7%
2019 | 9541 | 4.1%
2018 | 15860 | 6.8%
2017 | 11771 | 5.0%
2016 | 11943 | 5.1%
2015 | 10764 | 4.6%
2014 | 13619 | 5.8%
2013 | 14772 | 6.3%
2012 | 14595 | 6.3%

transmission
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
automatic | 132859
mechanical | 43904
variator | 30993
robot | 25661

Length

Max length: 10
Median length: 9
Mean length: 8.615567846
Min length: 5

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: robot
2nd row: mechanical
3rd row: robot
4th row: automatic
5th row: automatic

Common Values

Value | Count | Frequency (%)
automatic | 132859 | 56.9%
mechanical | 43904 | 18.8%
variator | 30993 | 13.3%
robot | 25661 | 11.0%

Length

Histogram of lengths of the category

Pie chart

owners
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
3.0 | 82971
1.0 | 61161
2.0 | 54564
0.0 | 34721

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value | Count | Frequency (%)
3.0 | 82971 | 35.5%
1.0 | 61161 | 26.2%
2.0 | 54564 | 23.4%
0.0 | 34721 | 14.9%

Length

Histogram of lengths of the category

Pie chart

vehicle_licence_original
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
1 | 208644
0 | 24773

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 208644 | 89.4%
0 | 24773 | 10.6%

Length

Histogram of lengths of the category

Pie chart

drivetrain
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
4wd | 117210
fwd | 98007
rwd | 18200

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: fwd
2nd row: fwd
3rd row: fwd
4th row: fwd
5th row: fwd

Common Values

Value | Count | Frequency (%)
4wd | 117210 | 50.2%
fwd | 98007 | 42.0%
rwd | 18200 | 7.8%

Length

Histogram of lengths of the category

Pie chart

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
1 | 227176
0 | 6241

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 227176 | 97.3%
0 | 6241 | 2.7%

Length

Histogram of lengths of the category

Pie chart

condition_good
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Value | Count
1 | 158018
0 | 75399

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 158018 | 67.7%
0 | 75399 | 32.3%

Length

Histogram of lengths of the category

Pie chart

price
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 15343
Distinct (%): 6.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1547263.344
Minimum: 0
Maximum: 99000000
Zeros: 34686
Zeros (%): 14.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 310000
Median: 850000
Q3: 2000000
95th percentile: 5490000
Maximum: 99000000
Range: 99000000
Interquartile range (IQR): 1690000

Descriptive statistics

Standard deviation: 2057326.115
Coefficient of variation (CV): 1.329654789
Kurtosis: 63.20704246
Mean: 1547263.344
Median Absolute Deviation (MAD): 730000
Skewness: 4.209606183
Sum: 3.611575679 × 10^11
Variance: 4.232590743 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
034686
 
14.9%
4500001212
 
0.5%
9960001134
 
0.5%
3500001045
 
0.4%
4000001004
 
0.4%
250000994
 
0.4%
680000975
 
0.4%
550000967
 
0.4%
600000963
 
0.4%
650000942
 
0.4%
Other values (15333)189495
81.2%
ValueCountFrequency (%)
034686
14.9%
160001
 
< 0.1%
184001
 
< 0.1%
200002
 
< 0.1%
210001
 
< 0.1%
230001
 
< 0.1%
240003
 
< 0.1%
250007
 
< 0.1%
2800010
 
< 0.1%
290001
 
< 0.1%
ValueCountFrequency (%)
990000001
< 0.1%
685000001
< 0.1%
654357921
< 0.1%
590875762
< 0.1%
580897202
< 0.1%
545604082
< 0.1%
540574002
< 0.1%
525000001
< 0.1%
480000002
< 0.1%
450000001
< 0.1%
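
Several of the figures reported for price are derived from the others. The following is a minimal Python sketch that re-derives them from the numbers quoted in this section; the variable names are illustrative and are not part of the report.

```python
# Values copied from the price section of the report.
n_rows = 233417
mean = 1547263.344
std = 2057326.115
q1, q3 = 310000, 2000000
zeros = 34686

iqr = q3 - q1                     # interquartile range
cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
total = mean * n_rows             # sum recovered from the mean and the row count
zeros_pct = 100 * zeros / n_rows  # share of rows with a zero price

print(iqr, round(cv, 4), f"{variance:.4e}", f"{total:.4e}", round(zeros_pct, 1))
```

The results agree with the reported IQR, CV, variance, sum and zero share up to rounding of the inputs.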

data_type
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Category counts: 2 (109764), 1 (88967), 0 (34686)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
2 | 109764 | 47.0%
1 | 88967 | 38.1%
0 | 34686 | 14.9%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

state_new
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Category counts: 0 (198696), 1 (34721)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 198696 | 85.1%
1 | 34721 | 14.9%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

engine_power
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 409
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 191.1406324
Minimum: 11
Maximum: 800
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 11
5-th percentile: 90
Q1: 130
Median: 170
Q3: 240
95-th percentile: 367
Maximum: 800
Range: 789
Interquartile range (IQR): 110

Descriptive statistics

Standard deviation: 92.7475183
Coefficient of variation (CV): 0.4852318271
Kurtosis: 4.270263434
Mean: 191.1406324
Median Absolute Deviation (MAD): 48
Skewness: 1.776144775
Sum: 44615473
Variance: 8602.102151
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
249 | 17640 | 7.6%
150 | 17078 | 7.3%
190 | 9146 | 3.9%
110 | 6163 | 2.6%
140 | 5307 | 2.3%
170 | 5230 | 2.2%
105 | 5041 | 2.2%
144 | 4522 | 1.9%
180 | 4371 | 1.9%
181 | 4038 | 1.7%
Other values (399) | 154881 | 66.4%

Minimum values

Value | Count | Frequency (%)
11 | 1 | < 0.1%
17 | 1 | < 0.1%
20 | 2 | < 0.1%
30 | 2 | < 0.1%
32 | 2 | < 0.1%
38 | 5 | < 0.1%
40 | 6 | < 0.1%
41 | 6 | < 0.1%
42 | 5 | < 0.1%
44 | 5 | < 0.1%

Maximum values

Value | Count | Frequency (%)
800 | 1 | < 0.1%
761 | 1 | < 0.1%
717 | 3 | < 0.1%
702 | 1 | < 0.1%
700 | 1 | < 0.1%
680 | 1 | < 0.1%
662 | 2 | < 0.1%
646 | 1 | < 0.1%
639 | 10 | < 0.1%
635 | 1 | < 0.1%

warranty
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Category counts: 0.0 (177946), 1.0 (55471)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 0.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 177946 | 76.2%
1.0 | 55471 | 23.8%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

popular_body
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Category counts: 1 (178368), 0 (55049)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
1 | 178368 | 76.4%
0 | 55049 | 23.6%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

rarity_car
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Category counts: 0 (233386), 1 (31)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 233386 | > 99.9%
1 | 31 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

old_car
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Category counts: 0 (221328), 1 (12089)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 221328 | 94.8%
1 | 12089 | 5.2%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

new_car
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Category counts: 0.0 (150005), 1.0 (83412)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 1.0
3rd row: 0.0
4th row: 0.0
5th row: 0.0

Common Values

Value | Count | Frequency (%)
0.0 | 150005 | 64.3%
1.0 | 83412 | 35.7%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

color_top5
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Category counts: 1 (194415), 0 (39002)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 194415 | 83.3%
0 | 39002 | 16.7%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

color_rare
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 MiB

Category counts: 0 (229014), 1 (4403)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 229014 | 98.1%
1 | 4403 | 1.9%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart of the category frequencies listed under Common Values]

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

description_count
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 1144
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 134.3498974
Minimum: 0
Maximum: 3506
Zeros: 3307
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 6
Q1: 28
Median: 74
Q3: 157
95-th percentile: 555
Maximum: 3506
Range: 3506
Interquartile range (IQR): 129

Descriptive statistics

Standard deviation: 172.4259998
Coefficient of variation (CV): 1.283409985
Kurtosis: 7.308393676
Mean: 134.3498974
Median Absolute Deviation (MAD): 54
Skewness: 2.431216496
Sum: 31359550
Variance: 29730.7254
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
57 | 4377 | 1.9%
10 | 3699 | 1.6%
0 | 3307 | 1.4%
9 | 3209 | 1.4%
99 | 2729 | 1.2%
7 | 2601 | 1.1%
161 | 2505 | 1.1%
56 | 2362 | 1.0%
14 | 2273 | 1.0%
19 | 2213 | 0.9%
Other values (1134) | 204142 | 87.5%

Minimum values

Value | Count | Frequency (%)
0 | 3307 | 1.4%
1 | 420 | 0.2%
2 | 1724 | 0.7%
3 | 1769 | 0.8%
4 | 1814 | 0.8%
5 | 2145 | 0.9%
6 | 2078 | 0.9%
7 | 2601 | 1.1%
8 | 2177 | 0.9%
9 | 3209 | 1.4%

Maximum values

Value | Count | Frequency (%)
3506 | 1 | < 0.1%
3398 | 1 | < 0.1%
2362 | 1 | < 0.1%
1701 | 1 | < 0.1%
1632 | 1 | < 0.1%
1619 | 1 | < 0.1%
1617 | 1 | < 0.1%
1613 | 1 | < 0.1%
1598 | 1 | < 0.1%
1560 | 1 | < 0.1%

mil_per_year
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 21572
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12071.67822
Minimum: 0
Maximum: 165667
Zeros: 34743
Zeros (%): 14.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 7333
Median: 12143
Q3: 16272
95-th percentile: 25121
Maximum: 165667
Range: 165667
Interquartile range (IQR): 8939

Descriptive statistics

Standard deviation: 8312.02682
Coefficient of variation (CV): 0.6885560292
Kurtosis: 6.41765818
Mean: 12071.67822
Median Absolute Deviation (MAD): 4402
Skewness: 1.163133069
Sum: 2817734915
Variance: 69089789.86
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 34743 | 14.9%
15000 | 1957 | 0.8%
10000 | 1507 | 0.6%
11000 | 1434 | 0.6%
20000 | 1155 | 0.5%
12000 | 1101 | 0.5%
16667 | 1008 | 0.4%
10500 | 964 | 0.4%
14000 | 903 | 0.4%
18000 | 828 | 0.4%
Other values (21562) | 187817 | 80.5%

Minimum values

Value | Count | Frequency (%)
0 | 34743 | 14.9%
1 | 54 | < 0.1%
2 | 11 | < 0.1%
3 | 7 | < 0.1%
4 | 5 | < 0.1%
5 | 37 | < 0.1%
6 | 17 | < 0.1%
7 | 26 | < 0.1%
8 | 13 | < 0.1%
9 | 18 | < 0.1%

Maximum values

Value | Count | Frequency (%)
165667 | 1 | < 0.1%
164167 | 1 | < 0.1%
160333 | 1 | < 0.1%
129630 | 1 | < 0.1%
126212 | 2 | < 0.1%
123750 | 1 | < 0.1%
111523 | 1 | < 0.1%
110578 | 1 | < 0.1%
110411 | 1 | < 0.1%
101111 | 1 | < 0.1%

Interactions

[Figures: 81 pairwise interaction plots]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
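
The calculation described above can be sketched in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. The function names are illustrative, and giving tied values the average of their positions is an assumption of this sketch rather than something the report states.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x
print(spearman(x, y), pearson(x, y))
```

On this toy input ρ is 1 up to floating-point error while r is noticeably lower, which is the sense in which ρ captures nonlinear monotonic relationships.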

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
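
A minimal Python sketch of this formula follows, together with a check of the invariance property for shifts and positive rescalings. The function name and the toy data are illustrative only.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # The 1/n factors of the covariance and standard deviations cancel out.
    return cov / (sd_x * sd_y)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_transformed = pearson_r([2 * a + 5 for a in x], [0.5 * b - 3 for b in y])
print(r_original, r_transformed)
```

Both calls give the same value up to floating-point error. A negative scale factor on one variable would flip the sign of r without changing its magnitude.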

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
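
The pair-counting description above can be sketched in a few lines of Python. This is the simplest form (often called tau-a), which counts tied pairs as neither concordant nor discordant. The function name is illustrative, and the values in the report may come from a tie-adjusted variant such as tau-b.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
print(tau)
```

For this toy input, 8 of the 10 pairs are concordant and 2 are discordant.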

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
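
As a rough illustration, the bias-corrected coefficient can be computed from a contingency table as sketched below. This follows the correction as it is commonly stated; the function name and the toy tables are hypothetical, and the sketch assumes every row and column of the table has a non-zero total.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrected quantities: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


independent = [[10, 10], [10, 10]]
perfect = [[50, 0], [0, 50]]
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```

The first table gives 0 (independence) and the second gives 1 up to floating-point error (perfect association).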

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

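(index) | df_index | body_type | brand | color | engine_displacement | engine_type | mileage | model_date | model_name | number_of_doors | year | transmission | owners | vehicle_licence_original | drivetrain | steering_wheel_left | condition_good | price | data_type | state_new | engine_power | warranty | popular_body | rarity_car | old_car | new_car | color_top5 | color_rare | description_count | mil_per_year
0 | 0 | лифтбек | skoda | синий | 1.2 | gasoline | 74000.0 | 2013.0 | octavia | 5.0 | 2014.0 | robot | 3.0 | 1 | fwd | 1 | 1 | 0.0 | 0 | 0 | 105.0 | 0.0 | 0 | 0 | 0 | 0.0 | 1 | 0 | 98 | 10571.0
1 | 1 | лифтбек | skoda | чёрный | 1.6 | gasoline | 60563.0 | 2017.0 | octavia | 5.0 | 2017.0 | mechanical | 1.0 | 1 | fwd | 1 | 1 | 0.0 | 0 | 0 | 110.0 | 0.0 | 0 | 0 | 0 | 1.0 | 1 | 0 | 263 | 15141.0
2 | 2 | лифтбек | skoda | серый | 1.8 | gasoline | 88000.0 | 2013.0 | superb | 5.0 | 2014.0 | robot | 1.0 | 1 | fwd | 1 | 1 | 0.0 | 0 | 0 | 152.0 | 0.0 | 0 | 0 | 0 | 0.0 | 1 | 0 | 98 | 12571.0
3 | 3 | лифтбек | skoda | коричневый | 1.6 | gasoline | 95000.0 | 2013.0 | octavia | 5.0 | 2014.0 | automatic | 1.0 | 1 | fwd | 1 | 1 | 0.0 | 0 | 0 | 110.0 | 0.0 | 0 | 0 | 0 | 0.0 | 0 | 0 | 170 | 13571.0
4 | 4 | лифтбек | skoda | белый | 1.8 | gasoline | 58536.0 | 2008.0 | octavia | 5.0 | 2012.0 | automatic | 1.0 | 1 | fwd | 1 | 1 | 0.0 | 0 | 0 | 152.0 | 0.0 | 0 | 0 | 0 | 0.0 | 1 | 0 | 236 | 6504.0
5 | 5 | лифтбек | skoda | серый | 2.0 | gasoline | 172000.0 | 2008.0 | octavia_rs | 5.0 | 2012.0 | robot | 3.0 | 1 | fwd | 1 | 1 | 0.0 | 0 | 0 | 200.0 | 0.0 | 0 | 0 | 0 | 0.0 | 1 | 0 | 43 | 19111.0
6 | 6 | внедорожник | skoda | пурпурный | 1.8 | gasoline | 107000.0 | 2009.0 | yeti | 5.0 | 2012.0 | robot | 1.0 | 1 | 4wd | 1 | 1 | 0.0 | 0 | 0 | 152.0 | 0.0 | 1 | 0 | 0 | 0.0 | 0 | 0 | 71 | 11889.0
7 | 7 | лифтбек | skoda | белый | 1.6 | gasoline | 226706.0 | 2008.0 | octavia | 5.0 | 2011.0 | mechanical | 3.0 | 1 | fwd | 1 | 1 | 0.0 | 0 | 0 | 102.0 | 0.0 | 0 | 0 | 0 | 0.0 | 1 | 0 | 222 | 22671.0
8 | 8 | внедорожник | skoda | бежевый | 1.4 | gasoline | 9706.0 | 2016.0 | kodiaq | 5.0 | 2019.0 | mechanical | 1.0 | 1 | 4wd | 1 | 1 | 0.0 | 0 | 0 | 150.0 | 1.0 | 1 | 0 | 0 | 1.0 | 0 | 0 | 273 | 4853.0
9 | 9 | внедорожник | skoda | белый | 1.8 | gasoline | 37361.0 | 2009.0 | yeti | 5.0 | 2012.0 | mechanical | 1.0 | 1 | 4wd | 1 | 1 | 0.0 | 0 | 0 | 152.0 | 0.0 | 1 | 0 | 0 | 0.0 | 1 | 0 | 226 | 4151.0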

Last rows

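(index) | df_index | body_type | brand | color | engine_displacement | engine_type | mileage | model_date | model_name | number_of_doors | year | transmission | owners | vehicle_licence_original | drivetrain | steering_wheel_left | condition_good | price | data_type | state_new | engine_power | warranty | popular_body | rarity_car | old_car | new_car | color_top5 | color_rare | description_count | mil_per_year
233407 | 254253 | седан | mitsubishi | синий | 2.4 | gasoline | 200000.0 | 1998.0 | galant | 4.0 | 2003.0 | automatic | 3.0 | 1 | fwd | 1 | 1 | 136000.0 | 2 | 0 | 144.0 | 0.0 | 1 | 0 | 0 | 0.0 | 1 | 0 | 43 | 10526.0
233408 | 254254 | внедорожник | mitsubishi | синий | 1.8 | gasoline | 100824.0 | 2012.0 | asx | 5.0 | 2014.0 | variator | 1.0 | 0 | fwd | 1 | 1 | 959200.0 | 2 | 0 | 140.0 | 0.0 | 1 | 0 | 0 | 0.0 | 1 | 0 | 93 | 12603.0
233409 | 254255 | внедорожник | mitsubishi | серебристый | 2.4 | diesel | 50403.0 | 2015.0 | pajero_sport | 5.0 | 2018.0 | automatic | 1.0 | 1 | 4wd | 1 | 1 | 2114400.0 | 2 | 0 | 181.0 | 0.0 | 1 | 0 | 0 | 1.0 | 1 | 0 | 131 | 12601.0
233410 | 254256 | внедорожник | mitsubishi | синий | 2.0 | gasoline | 0.0 | 2018.0 | outlander | 5.0 | 2021.0 | variator | 0.0 | 1 | 4wd | 1 | 1 | 1641600.0 | 2 | 1 | 146.0 | 1.0 | 1 | 0 | 0 | 1.0 | 1 | 0 | 556 | 0.0
233411 | 254257 | внедорожник | mitsubishi | серебристый | 3.0 | gasoline | 168153.0 | 2006.0 | pajero | 5.0 | 2007.0 | automatic | 3.0 | 1 | 4wd | 1 | 1 | 1076000.0 | 2 | 0 | 178.0 | 0.0 | 1 | 0 | 0 | 0.0 | 1 | 0 | 57 | 11210.0
233412 | 254258 | седан | mitsubishi | чёрный | 2.0 | gasoline | 241800.0 | 2007.0 | lancer | 4.0 | 2007.0 | mechanical | 3.0 | 1 | fwd | 1 | 1 | 420000.0 | 2 | 0 | 150.0 | 0.0 | 1 | 0 | 0 | 0.0 | 1 | 0 | 137 | 16120.0
233413 | 254259 | внедорожник | mitsubishi | серебристый | 2.4 | gasoline | 265000.0 | 2014.0 | outlander | 5.0 | 2014.0 | variator | 2.0 | 1 | 4wd | 1 | 1 | 956400.0 | 2 | 0 | 167.0 | 0.0 | 1 | 0 | 0 | 0.0 | 1 | 0 | 42 | 33125.0
233414 | 254260 | внедорожник | mitsubishi | белый | 2.0 | gasoline | 0.0 | 2018.0 | outlander | 5.0 | 2021.0 | variator | 0.0 | 1 | 4wd | 1 | 1 | 1631340.0 | 2 | 1 | 146.0 | 1.0 | 1 | 0 | 0 | 1.0 | 1 | 0 | 485 | 0.0
233415 | 254261 | внедорожник | mitsubishi | синий | 2.4 | gasoline | 128000.0 | 2009.0 | outlander | 5.0 | 2011.0 | variator | 1.0 | 1 | 4wd | 1 | 1 | 782400.0 | 2 | 0 | 170.0 | 0.0 | 1 | 0 | 0 | 0.0 | 1 | 0 | 141 | 11636.0
233416 | 254262 | седан | mitsubishi | серебристый | 1.6 | gasoline | 70643.0 | 2011.0 | lancer | 4.0 | 2014.0 | mechanical | 1.0 | 1 | fwd | 1 | 1 | 700400.0 | 2 | 0 | 117.0 | 0.0 | 1 | 0 | 0 | 0.0 | 1 | 0 | 42 | 8830.0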